ABSTRACT

The present paper is concerned with the estimation of the
transition distributions of a Markov renewal process with finitely
many states, which extends and unifies some aspects of the results
in the special cases of discrete and continuous parameter Markov
chains. A natural estimator of the transition distributions is
defined and shown to be consistent. Limiting distributions of
this estimator are derived. A density for a Markov renewal process
is developed to permit the definition of maximum likelihood
estimators for a renewal process and for a Markov renewal process.

INTRODUCTION

The general theory of statistical inference in Markov processes
began with Bartlett's paper in 1951 [2]. Later developments are
presented in Billingsley's book [4] and his expository paper [5],
both of which appeared in 1961. We refer in particular to the
development of maximum likelihood estimators for the transition
probabilities of a Markov chain, either discrete or continuous
parameter, by Billingsley [4] and more recently by Albert [1] in
1962. The present paper is concerned with the estimation of the
transition distributions of a Markov renewal process with finitely
many states, which extends and unifies some aspects of the results
in the special cases of discrete and continuous parameter Markov
chains. In Chapter 2 a natural estimator of the transition
distributions is defined and shown to be consistent. Limiting
distributions of this estimator are derived in Chapter 3. A density
for a Markov renewal process is developed in Chapter 4 to permit
the definition of maximum likelihood estimators for a renewal process
in Chapter 5 and for a Markov renewal process in Chapter 6.


1. PRELIMINARY CONCEPTS AND DEFINITIONS

The constructive definition given in [11] of a Markov renewal
process (MRP) with $m$ ($< \infty$) states is, briefly, as follows. One is
given a matrix of transition distributions $(Q_{ij})$, where each $Q_{ij}$
is a mass function defined on $(-\infty, \infty)$ satisfying $Q_{ij}(x) = 0$ for
$x \le 0$ and $\sum_{j=1}^{m} Q_{ij}(\infty) = 1$ $(1 \le i \le m)$. One is also given an
$m$-tuple of initial probabilities $(p_1, \ldots, p_m)$ that satisfies
$p_j \ge 0$ and $\sum_{j=1}^{m} p_j = 1$. Consider any two-dimensional Markov
process $\{(J_n, X_n);\ n \ge 0\}$ defined on a complete probability space
that satisfies $X_0 = 0$ (a.s.), $P[J_0 = k] = p_k$, and

    $P[J_n = k,\ X_n \le x \mid J_0, J_1, \ldots, J_{n-1}, X_1, \ldots, X_{n-1}] = Q_{J_{n-1},k}(x)$

for all $x \in (-\infty, \infty)$ and $1 \le k \le m$. The matrix $(p_{ij})$ is defined
by $p_{ij} = Q_{ij}(\infty)$. If $p_{ij} > 0$, set $F_{ij} = p_{ij}^{-1} Q_{ij}$, while if
$p_{ij} = 0$, let $F_{ij}$ be arbitrary. The integer-valued stochastic
processes $\{N(t);\ t \ge 0\}$, $\{N_j(t);\ t \ge 0\}$, and $\{N_{ij}(t);\ t \ge 0\}$ are
defined by

    $N(t) = \sup\{n \ge 0 : \sum_{i=0}^{n} X_i \le t\}$,
    $N_j(t)$ = the number of times $J_k = j$ for $1 \le k \le N(t)$,
    $N_{ij}(t)$ = the number of times $J_{k-1} = i$ and $J_k = j$ for $1 \le k \le N(t)$.

Then the stochastic process $\{(N_1(t), N_2(t), \ldots, N_m(t));\ t \ge 0\}$ is
called a Markov renewal process determined by the given initial
probabilities and matrix of transition distributions.

The following consequences of the above definitions, derived
in [11], will be used below.

(1.1)    $P[J_n = j \mid J_0, \ldots, J_{n-2}, J_{n-1} = i] = p_{ij}$,

         $P[X_n \le x \mid J_0, \ldots, J_{n-2}, J_{n-1} = i, J_n = j] = F_{ij}(x)$,

         $P[X_{n_1} \le x_1, \ldots, X_{n_k} \le x_k \mid J_n,\ n \ge 0] = \prod_{i=1}^{k} F_{J_{n_i - 1}, J_{n_i}}(x_i)$

for $0 < n_1 < \cdots < n_k$, the last equality holding with probability
one.

It is assumed throughout that the MRP is irreducible,
recurrent, and that $F_{ij} = H_i$ for $1 \le j \le m$. This last
assumption incurs no loss of generality, as is pointed out in [12].


Estimators for the transition probabilities $Q_{ij}(x)$ are
defined on sample functions of the MRP over $[0, t)$. These sample
functions of the MRP are equivalent to the sample functions
$(J_0, J_1, \ldots, J_{N(t)}, X_1, X_2, \ldots, X_{N(t)})$. Let $X_{i,j}$ denote the holding
time of the $j$-th visit to state $i$; that is, the $\{X_{i,j};\ 1 \le i \le m,\ 1 \le j \le N_i(t)\}$
are just a relabeling of $\{X_i;\ 1 \le i \le N(t)\}$.
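
The constructive definition above can be turned into a small simulation. The following is a minimal sketch, not part of the original paper: the function name `simulate_mrp`, the 0-based state labels, and the representation of each $H_i$ by a sampler function are all illustrative assumptions. It uses the standing assumption $F_{ij} = H_i$, so each holding time is drawn from the law attached to the state being left.

```python
import random

def simulate_mrp(p0, P, holding_samplers, t, rng):
    """Simulate (J_0..J_N, X_1..X_N) for an MRP observed on [0, t].

    p0               : initial probabilities (p_1..p_m), states labeled 0..m-1
    P                : transition matrix (p_ij) as a list of rows
    holding_samplers : holding_samplers[i](rng) draws one holding time from H_i
    Returns (J, X, u) where u = t - X_1 - ... - X_N is the censored remainder.
    """
    states = list(range(len(p0)))
    j = rng.choices(states, weights=p0)[0]       # J_0 drawn from the initial law
    J, X = [j], []
    elapsed = 0.0
    while True:
        x = holding_samplers[j](rng)             # holding time in current state
        if elapsed + x > t:                      # sojourn still running at time t
            break
        elapsed += x
        j = rng.choices(states, weights=P[j])[0] # next state from row j of (p_ij)
        X.append(x)
        J.append(j)
    return J, X, t - elapsed
```

A call such as `simulate_mrp([0.5, 0.5], [[0.2, 0.8], [0.6, 0.4]], [f, f], 10.0, random.Random(0))`, with `f = lambda r: r.expovariate(1.0)`, returns one sample function in the tuple form used in the text.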



2. DEFINITION AND CONSISTENCY OF A NATURAL ESTIMATOR

Consider the estimator defined by

(2.1)    $\hat Q_{ij}(x;t) = \hat p_{ij}(t)\, \hat H_i(x;t)$,

where $t, x > 0$,

(2.2)    $\hat p_{ij}(t) = N_{ij}(t)/N_i(t)$,

(2.3)    $\hat H_i(x;t) = N_i(t)^{-1} \sum_{k=1}^{N_i(t)} \varepsilon(x - X_{i,k})$,

and where $\varepsilon(u)$ equals one if $u \ge 0$ and zero otherwise. That is,
$\hat H_i(x;t)$ is the ordinary empirical distribution function but
determined from the sample, of random size $N_i(t)$, of the holding
times in state $i$. Interpret $\hat Q_{ij}(x;t)$ to be zero if $N_i(t) = 0$.
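
As an illustration of (2.1) to (2.3), the sketch below computes the estimator from one observed sample function. It is a hedged example, not the authors' code: the name `natural_estimator` and the 0-based state labels are assumptions, and the sample size for state $i$ is taken to be the number of completed sojourns in $i$ (transitions out of $i$), which can differ from $N_i(t)$ as defined in the text by at most one.

```python
from bisect import bisect_right

def natural_estimator(J, X, m):
    """Return a function q_hat(i, j, x) implementing (2.1) for states 0..m-1.

    J : visited states J_0..J_N;  X : holding times X_1..X_N (X[k] is spent in J[k]).
    """
    n_ij = [[0] * m for _ in range(m)]
    holds = [[] for _ in range(m)]
    for k, x in enumerate(X):          # k-th transition: J[k] -> J[k+1] after time x
        i, j = J[k], J[k + 1]
        n_ij[i][j] += 1
        holds[i].append(x)
    for h in holds:
        h.sort()                       # sorted holding times give the empirical d.f.

    def q_hat(i, j, x):
        n_i = len(holds[i])            # assumption: completed sojourns in state i
        if n_i == 0:
            return 0.0                 # convention from the text when no visits occur
        p_hat = n_ij[i][j] / n_i                   # (2.2)
        h_hat = bisect_right(holds[i], x) / n_i    # (2.3): fraction of times <= x
        return p_hat * h_hat                       # (2.1)

    return q_hat
```

Sorting each state's holding times once makes every later evaluation of the empirical distribution function a binary search.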

The estimator (2.1) is a natural combination of estimators
used in Markov chain inference and in classical inference for fixed
sample size. Derman [8] has studied $\hat p_{ij}(t)$ as an estimator for
the transition probabilities of a Markov chain, with the small
difference that the total number of transitions, $N(t)$,
is not random. The empirical distribution function for non-random
sample size has been studied extensively (c.f. Darling [7]).

Consistency of (2.1) and the limiting distributions of (2.1)
are obtained using the general limit theorems for MRP developed by
Pyke [12]. In [12] the limiting behavior (as $t \to \infty$) of sums of
the form

(2.4)    $W_f(t) = \sum_{n=1}^{N(t)} f(J_{n-1}, J_n, X_n)$

is studied for real valued functions $f$ defined on the state space
of an MRP. We recall the notation used in [12]. $\mu_{ij}$ and $\mu^*_{ij}$
denote the first moment of the distribution of the first passage
time from state $i$ to state $j$ of the MRP and of the corresponding
Markov chain $\{J_n;\ n \ge 0\}$, respectively. Define recurrence indices
$r_{j,s}$ by $r_{j,0} = 0$ and, for $s \ge 1$,

    $r_{j,s} = \sup\{k : k > r_{j,s-1},\ J_i \ne j\ (r_{j,s-1} < i < k)\}$.

The sequence of random variables (r.v.'s) $\{U_{j,s};\ s \ge 0\}$ is
defined by

(2.5)    $U_{j,s} = \sum_{n = r_{j,s}+1}^{r_{j,s+1}} f(J_{n-1}, J_n, X_n)$.

That is, $U_{j,s}$ is the contribution to the sum $W_f(t)$ obtained
between the $s$-th and the $(s+1)$-th occurrence time of state $j$.
The random variables $\{U_{j,s};\ s \ge 1\}$ are independent and identically
distributed. Set

    $\xi_{ik} = \int_0^{\infty} f(i,k,x)\, dQ_{ik}(x)$,    $A_i = \sum_{k=1}^{m} \xi_{ik}$,

    $\xi^{(2)}_{ik} = \int_0^{\infty} [f(i,k,x)]^2\, dQ_{ik}(x)$,    $B_i = \sum_{k=1}^{m} \xi^{(2)}_{ik}$.

When the mean and variance of $U_{i,1}$ exist, they will be denoted
by $m_i$ and $\sigma_i^2$ respectively. Since $m$ is finite, it follows
from [12] that when they exist, they are given by

(2.6)    $m_i = \sum_{r=1}^{m} A_r\, \mu^*_{ii} / \mu^*_{rr}$

and

(2.7)    $\sigma_i^2 = -m_i^2 + \sum_{r=1}^{m} B_r\, \mu^*_{ii} / \mu^*_{rr}$ + (a third sum; its terms are illegible in this copy, see [12]).

Theorem 2.1: The estimator (2.1) is uniformly strongly consistent
as $t \to \infty$ in the sense that with probability one,

(2.8)    $\max_{i,j} \sup_x |\hat Q_{ij}(x;t) - Q_{ij}(x)| \to 0$.

Proof: Since $\hat Q_{ij} - Q_{ij} = (\hat p_{ij} - p_{ij}) \hat H_i + p_{ij} (\hat H_i - H_i)$, the left
side of (2.8) satisfies

    $\max_{i,j} \sup_x |\hat Q_{ij}(x;t) - Q_{ij}(x)| \le \max_{i,j} |N_{ij}(t)/N_i(t) - p_{ij}| + \max_i \sup_x |\hat H_i(x;t) - H_i(x)|$.

Since $N_i(t) \to \infty$ (a.s.) by the regularity of the MRP, one
concludes from the Glivenko-Cantelli theorem for non-random sample
sizes that $\sup_x |\hat H_i(x;t) - H_i(x)| \to 0$ (a.s.). The proof is
completed by showing $[N_{ij}(t)/N_i(t) - p_{ij}] \to 0$ (a.s.) for $1 \le i, j \le m$.
Let $k_\ell$ denote the state visited after the $\ell$-th visit to state $i$.
Then

(2.9)    $\sum_{\ell=1}^{N_i(t)-1} \delta_{k_\ell, j} \le N_{ij}(t) \le \sum_{\ell=1}^{N_i(t)} \delta_{k_\ell, j}$,

where $\delta_{k,j}$ denotes the Kronecker delta,
and by the Strong Law of Large Numbers both the right and left hand
sides of (2.9), when divided by $N_i(t)$, converge to $p_{ij}$ with
probability one.

3. ASYMPTOTIC DISTRIBUTION OF THE NATURAL ESTIMATOR

The limiting distribution of (2.1), (2.2), (2.3) can be
obtained by applying the central limit theorem for functions on an
MRP (c.f. Lemma 7.1, [12]).

Theorem 3.1: For fixed $i, j, x$, $(t^{1/2}[\hat p_{ij}(t) - p_{ij}],\ t^{1/2}[\hat H_i(x;t) - H_i(x)])$
converges in law as $t \to \infty$ to a bivariate normal r.v. with means
zero and covariance matrix $(\sigma_{ij})$ given by

(3.1)    $\sigma_{11} = \mu_{ii}\, p_{ij}(1 - p_{ij})$,    $\sigma_{22} = \mu_{ii}\, H_i(x)[1 - H_i(x)]$,    $\sigma_{12} = 0$.

Proof: Let $w_1$ and $w_2$ be arbitrary constants. To prove the
asymptotic joint normality it suffices to show that

(3.2)    $w_1 t^{1/2}[\hat p_{ij}(t) - p_{ij}] + w_2 t^{1/2}[\hat H_i(x;t) - H_i(x)]$

converges in law to a normal r.v. for all $w_1$ and $w_2$. We
rewrite (3.2) as the product of $[t/N_i(t)]$ and $t^{-1/2} W_f(t)$, where
$W_f(t)$ is a sum of the form (2.4) with the function $f$ defined by

(3.3)    $f(r,s,y) = \{w_1[\delta_{sj} - p_{ij}] + w_2[\varepsilon(x - y) - H_i(x)]\}\, \delta_{ri}$.

For the function (3.3),

    $A_r = w_1 \delta_{ri}[p_{rj} - p_{ij}] + w_2 \delta_{ri}[H_r(x) - H_i(x)] = 0$

and

    $B_r = \{w_1^2[p_{rj} + p_{ij}^2 - 2 p_{ij} p_{rj}] + w_2^2[H_r(x) + H_i^2(x) - 2 H_r(x) H_i(x)]\}\, \delta_{ri}$

for $1 \le r \le m$; hence $m_i = 0$ and the third sum in (2.7) is zero.
Then the variance of $U_{i,1}$ is

    $\sigma_i^2 = \sum_{r=1}^{m} B_r\, \mu^*_{ii}/\mu^*_{rr} = w_1^2\, p_{ij}[1 - p_{ij}] + w_2^2\, H_i(x)[1 - H_i(x)]$.

The variance $\sigma_i^2$ is finite, so from Lemma 7.1 of [12] the
limiting distribution of $t^{-1/2} W_f(t)$ for the $f$ given in (3.3)
is normal with zero mean and variance $\sigma_i^2/\mu_{ii}$. But $t/N_i(t) \to \mu_{ii}$ (a.s.),
so the limiting distribution of (3.2) is normal with zero mean and
variance $\mu_{ii} \sigma_i^2$, as required.


The zero correlation between $\hat p_{ij}(t)$ and $\hat H_i(x;t)$ yields the
following result.

Corollary 3.2: For fixed $i, j, x$, $\hat p_{ij}(t)$ and $\hat H_i(x;t)$ are
asymptotically independent.

The asymptotic normality of (3.2) can be used to obtain the
limiting distribution of $\hat Q_{ij}(x;t)$.

Corollary 3.3: For fixed $i, j, x$, $t^{1/2}[\hat Q_{ij}(x;t) - Q_{ij}(x)]$ converges
in law as $t \to \infty$ to a normally distributed r.v. with mean zero and
variance equal to

(3.4)    $\mu_{ii}\, H_i(x)\, p_{ij}\, [H_i(x) - 2 H_i(x) p_{ij} + p_{ij}]$.

Proof: We rewrite $t^{1/2}[\hat Q_{ij}(x;t) - Q_{ij}(x)]$ as

(3.5)    $t^{1/2} \hat H_i(x;t)[\hat p_{ij}(t) - p_{ij}] + t^{1/2} p_{ij}[\hat H_i(x;t) - H_i(x)]$.

By a well known convergence theorem (Cramér [6], Section 20.6) the
limiting distribution of (3.5) is the same as the limiting distribution
of

(3.6)    $t^{1/2} H_i(x)[\hat p_{ij}(t) - p_{ij}] + t^{1/2} p_{ij}[\hat H_i(x;t) - H_i(x)]$.

With the particular choice $w_1 = H_i(x)$ and $w_2 = p_{ij}$, (3.2) is
just (3.6) and the proof is complete.
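
The step from Theorem 3.1 to (3.4) is a short piece of algebra: with $w_1 = H_i(x)$ and $w_2 = p_{ij}$, the variance $\mu_{ii}\{w_1^2 p_{ij}(1 - p_{ij}) + w_2^2 H_i(x)[1 - H_i(x)]\}$ factors into (3.4). The sketch below checks this numerically; the function names are illustrative and not from the paper.

```python
def variance_3_4(mu_ii, H_ix, p_ij):
    """Asymptotic variance (3.4) of t^(1/2) [Q_hat_ij(x;t) - Q_ij(x)]."""
    return mu_ii * H_ix * p_ij * (H_ix - 2.0 * H_ix * p_ij + p_ij)

def variance_from_theorem_3_1(mu_ii, H_ix, p_ij):
    """Variance of (3.2) with w1 = H_i(x), w2 = p_ij, using sigma_11, sigma_22 of (3.1)."""
    w1, w2 = H_ix, p_ij
    sigma_11 = mu_ii * p_ij * (1.0 - p_ij)
    sigma_22 = mu_ii * H_ix * (1.0 - H_ix)
    return w1 ** 2 * sigma_11 + w2 ** 2 * sigma_22   # sigma_12 = 0, so no cross term
```

Both functions should agree for any admissible inputs, for example `mu_ii = 2.0`, `H_ix = 0.5`, `p_ij = 0.25`.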

The asymptotic normality given in Corollary 3.3 can be extended
to the finite dimensional distribution of the r.v.'s

    $\{W_{ijk} = \hat Q_{ij}(x_k;t) - Q_{ij}(x_k)$ for $1 \le i, j \le m$ and $1 \le k \le s\}$.

Theorem 3.4: For fixed $s$, the distribution of $\{t^{1/2} W_{ijk};\ 1 \le i, j \le m,\ 1 \le k \le s\}$
converges in law as $t \to \infty$ to an
$m^2 s$-dimensional normal r.v. with zero mean and covariance matrix
$(\sigma_{ijk,uvw})$ given by

(3.8)    $\sigma_{ijk,uvw} = \mu_{ii}\, \delta_{iu}\, p_{ij}\, [H_i(x_k) H_i(x_w) \delta_{jv} + H_i(\min[x_k, x_w])\, p_{iv} - 2 H_i(x_k) H_i(x_w)\, p_{iv}]$.
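
Proof: Let $\{\lambda_{ijk};\ 1 \le i, j \le m,\ 1 \le k \le s\}$ be arbitrary constants.
It will suffice to show that

(3.9)    $t^{1/2} \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{k=1}^{s} \lambda_{ijk} W_{ijk}$

converges in law to a normal r.v. for all real $\lambda_{ijk}$. We may
rewrite (3.9) as

    $[t/N_1(t)]\, t^{-1/2} \sum_{i=1}^{m} [N_1(t)/N_i(t)] \sum_{j=1}^{m} \sum_{k=1}^{s} \lambda_{ijk}$
        $\cdot \{[N_{ij}(t) - p_{ij} N_i(t)]\, \hat H_i(x_k;t) + p_{ij} N_i(t)\, [\hat H_i(x_k;t) - H_i(x_k)]\}$.

As in the proof of Corollary 3.3, the expression may be shown to have
the same limiting distribution as

    $[t/N_1(t)]\, t^{-1/2} \sum_{i=1}^{m} (\mu_{ii}/\mu_{11}) \sum_{j=1}^{m} \sum_{k=1}^{s} \lambda_{ijk}$
        $\cdot [N_{ij}(t) H_i(x_k) + p_{ij} N_i(t) \hat H_i(x_k;t) - 2 p_{ij} N_i(t) H_i(x_k)]$.

This in turn can be written as a product of $[t/N_1(t)]\, t^{-1/2}$ and a sum
of the form (2.4) by using the function $f$ defined by

(3.10)    $f(r,s,y) = \mu_{11}^{-1} \sum_{i=1}^{m} \mu_{ii} \sum_{j=1}^{m} \sum_{k=1}^{s} \lambda_{ijk}\, \delta_{ri}\, [H_i(x_k) \delta_{sj} + p_{ij}\, \varepsilon(x_k - y) - 2 H_i(x_k) p_{ij}]$.

For this function,

    $A_r = \mu_{11}^{-1} \sum_{i=1}^{m} \mu_{ii} \sum_{j=1}^{m} \sum_{k=1}^{s} \lambda_{ijk}\, \delta_{ri}\, [H_i(x_k) p_{rj} + p_{ij} H_r(x_k) - 2 H_i(x_k) p_{ij}] = 0$

for $1 \le r \le m$; hence $m_1 = 0$ and the third sum in (2.7) is
zero. Then the variance of $U_{1,1}$ is given by

    $\sigma_1^2 = \sum_{r=1}^{m} B_r\, \mu^*_{11}/\mu^*_{rr}$,

which may be shown to reduce to

    $\sigma_1^2 = \mu_{11}^{-1} \sum_{i=1}^{m} \mu_{ii} \sum_{j=1}^{m} \sum_{k=1}^{s} \sum_{v=1}^{m} \sum_{w=1}^{s} \lambda_{ijk} \lambda_{ivw}$
        $\cdot [H_i(x_k) H_i(x_w) \delta_{jv}\, p_{ij} + H_i(\min[x_k, x_w])\, p_{iv} p_{ij} - 2 H_i(x_k) H_i(x_w)\, p_{iv} p_{ij}]$.

The variance $\sigma_1^2$ is finite, so by the same argument as in Theorem
3.1, the limiting distribution of (3.9) is normal with zero mean
and variance $\mu_{11} \sigma_1^2$. The required covariance matrix (3.8) is
obtained from the coefficients of $\lambda_{ijk} \lambda_{uvw}$, thereby completing
the proof.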


Consider a renewal process, that is, an MRP with one state
for which $m = 1$, $p_{11} = 1$, $N_{11}(t) = N_1(t) = N(t)$. From Theorem
3.4 the limiting distribution as $t \to \infty$ of $N(t)^{1/2}[\hat H_1(x_k;t) - H_1(x_k)]$ for
$1 \le k \le s$ with $s$ fixed, is normal with zero means and covariance
matrix $(\sigma_{kw})$ defined by

    $\sigma_{kw} = H_1(\min[x_k, x_w]) - H_1(x_k) H_1(x_w)$.

Consider the Markov chain obtained from the MRP by letting
the holding times be degenerate at one, that is, $H_i(x) = \varepsilon(x - 1)$,
$\mu_{ii} = \mu^*_{ii}$. From Theorem 3.4 one obtains that as $t \to \infty$,
$t^{1/2}[N_{ij}(t)/N_i(t) - p_{ij}]$ for $1 \le i, j \le m$ converges to a normal
r.v. with zero means and covariance matrix given by

    $\sigma_{ij,uv} = \mu_{ii}\, \delta_{iu}\, p_{ij}(\delta_{jv} - p_{iv})$.

This is equivalent to Derman's result on the limiting distribution
of $n^{1/2} N_{ij}(n)/N_i(n)$ (c.f. Billingsley [5]).
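
The two special cases above can be checked numerically against the general covariance of Theorem 3.4. The sketch below is illustrative only: the name `sigma_3_8` and the argument layout are assumptions, and the covariance is coded in the symmetric form $\mu_{ii} \delta_{iu} p_{ij} [H_i(x_k) H_i(x_w) \delta_{jv} + H_i(\min[x_k, x_w]) p_{iv} - 2 H_i(x_k) H_i(x_w) p_{iv}]$, which is the reading of (3.8) that reduces to (3.4) when $(i,j,k) = (u,v,w)$.

```python
def sigma_3_8(mu, p, H, i, j, xk, u, v, xw):
    """Limiting covariance of t^(1/2) W_ijk and t^(1/2) W_uvw (states 0-based).

    mu : list of mean recurrence times mu_ii
    p  : transition matrix (p_ij) as a list of rows
    H  : list of callables, H[i](x) is the holding-time d.f. of state i
    """
    if i != u:
        return 0.0                                  # different rows are uncorrelated
    hk, hw = H[i](xk), H[i](xw)
    hmin = H[i](min(xk, xw))
    delta_jv = 1.0 if j == v else 0.0
    return mu[i] * p[i][j] * (hk * hw * delta_jv + hmin * p[i][v] - 2.0 * hk * hw * p[i][v])
```

With one state, `p = [[1.0]]` and `mu = [1.0]`, the value reduces to $H(\min[x_k, x_w]) - H(x_k) H(x_w)$; with holding times degenerate at one and $x_k = x_w = 1$, it reduces to $\mu_{ii} p_{ij}(\delta_{jv} - p_{iv})$.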


4. DENSITY FOR A MARKOV RENEWAL PROCESS

A density for an MRP is defined in a manner similar to the
definition of a density for a continuous time Markov process by
Billingsley [4] and Albert [1]. From the constructive definition
of an MRP given in Section 1, almost all sample functions for an
MRP up to time $t$ can be represented as the finite tuple
$R(t) = (J_0, \ldots, J_{N(t)}, X_1, \ldots, X_{N(t)})$. Almost every sample
function may therefore be represented as a point in $\Omega = \bigcup_{n=0}^{\infty} \Omega_n$,
where $\Omega_n$ is the $(n+1)$-fold Cartesian product of
$\{1, 2, \ldots, m\} \times [0, \infty)$. Let $\mathcal{A}_n$ be the product Borel field on $\Omega_n$
generated by all subsets of $\{1, 2, \ldots, m\}$ and the ordinary Borel
sets on $[0, \infty)$. Let $\mathcal{A}$ be the smallest $\sigma$-field containing each
$\mathcal{A}_n$, $0 \le n < \infty$. For convenience we will assume that the underlying
probability space on which the MRP is defined is $(\Omega, \mathcal{A}, P)$. On this
probability space the measure $P$ is as follows.

Theorem 4.1: For any $n \ge 0$ and integers $1 \le j_r \le m$ for
$0 \le r \le n$, the probability measure $P$ on $(\Omega, \mathcal{A})$ is given by

(4.1)    $P[N(t) = n,\ J_0 = j_0, \ldots, J_n = j_n,\ X_1 \le a_1, \ldots, X_n \le a_n]$

         $= p_{j_0} \int_{C_n} [1 - H_{j_n}(u_t)] \prod_{k=0}^{n-1} p_{j_k j_{k+1}}\, dH_{j_k}(x_{k+1})$,

where $u_t = t - x_1 - x_2 - \cdots - x_n$ and

    $C_n = \{(x_1, x_2, \ldots, x_n) : x_k \le a_k,\ 1 \le k \le n,$ and $u_t > 0\}$.

In particular,

    $P[N(t) = 0,\ J_0 = j_0] = p_{j_0} [1 - H_{j_0}(t)]$.

Proof: From (1.1) the conditional distribution of $\{X_i;\ 1 \le i \le n\}$
given $\{J_i;\ 0 \le i \le n\}$ is that of $n$ independent r.v.'s with
distribution functions $H_{J_{i-1}}$ respectively. The proof follows
immediately.


The density of the process can now be exhibited as the Radon-
Nikodym derivative of the probability distribution (4.1) with
respect to the measure defined as follows. Let $\rho$ be Lebesgue
measure on $[0, \infty)$, let $\lambda$ be counting measure on $\{1, 2, \ldots, m\}$,
and let $\sigma_n$ be the appropriate product measure on $(\Omega_n, \mathcal{A}_n)$. For
each set $B \in \mathcal{A}$ define $\sigma^*(B) = \sum_{n=0}^{\infty} \sigma_n(B \cap \Omega_n)$, which determines
a measure on $(\Omega, \mathcal{A})$.

The density is now set forth explicitly.

Theorem 4.2: If each $H_i$ is absolutely continuous with density
function $h_i$, then one may write

    $P(B) = \int_B f(v)\, d\sigma^*(v)$,    $B \in \mathcal{A}$,

where

(4.2)    $f(v) = p_{j_0} [1 - H_{j_0}(t)]$    if $v = (j_0)$,

         $f(v) = p_{j_0} [1 - H_{j_n}(u_t)] \prod_{k=0}^{n-1} p_{j_k j_{k+1}}\, h_{j_k}(x_{k+1})$    if $v = (j_0, \ldots, j_n, x_1, \ldots, x_n)$ with $u_t > 0$,

         $f(v) = 0$    otherwise,

and where $u_t = t - x_1 - x_2 - \cdots - x_n$.

Proof: Let the conditional density of $(J_0, \ldots, J_n, X_1, \ldots, X_n)$ with
respect to $\sigma_n$, given that $N(t) = n$, be denoted by
$g_n(j_0, \ldots, j_n, x_1, \ldots, x_n)$. This conditional density exists since
under the stated condition, $R(t)$ is a vector r.v. of fixed
dimension whose coordinates are either discrete or absolutely
continuous. By Theorem 4.1, $P[N(t) = n]\, g_n$ must coincide a.e. with
$f$. Thus, for $B \in \mathcal{A}$,

    $P(B) = \sum_{n=0}^{\infty} \int_{B \cap \Omega_n} f\, d\sigma_n = \int_B f\, d\sigma^*$,

and so $f$ is the required density with respect to $\sigma^*$.

For the special case of exponential holding times (i.e. a
continuous time Markov process), the density (4.2) reduces to Albert's
density (c.f. Theorem 3.2, [1]). For a renewal process, i.e. an
MRP with one state, the sample functions are of the form
$R(t) = (X_1, X_2, \ldots, X_{N(t)})$ and, for $H(x)$ absolutely continuous,
(4.2) reduces to

(4.3)    $f(v) = [1 - H(u_t)] \prod_{i=1}^{n} h(x_i)$    if $v = (x_1, x_2, \ldots, x_n)$.
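
The density (4.2) translates directly into a log-likelihood routine. The following is a minimal sketch under stated assumptions: the name `mrp_log_density`, the 0-based state labels, and the representation of $h_i$ and $1 - H_i$ as Python callables are illustrative choices, and all probabilities and densities evaluated along the path are assumed to be strictly positive.

```python
import math

def mrp_log_density(J, X, t, p0, P, h, S):
    """Log of the density (4.2) at v = (j_0..j_n, x_1..x_n).

    h[i](x) is the holding-time density h_i, S[i](x) = 1 - H_i(x) its survival function.
    Returns -inf when u_t = t - x_1 - ... - x_n is not positive.
    """
    if len(J) != len(X) + 1:
        raise ValueError("need one more state than holding times")
    u = t - sum(X)
    if u <= 0.0:
        return float("-inf")
    logf = math.log(p0[J[0]]) + math.log(S[J[-1]](u))   # p_{j0} [1 - H_{jn}(u_t)]
    for k, x in enumerate(X):
        # factor p_{j_k j_{k+1}} h_{j_k}(x_{k+1}) of the product in (4.2)
        logf += math.log(P[J[k]][J[k + 1]]) + math.log(h[J[k]](x))
    return logf
```

For a single state with unit-rate exponential holding times the value is $-t$ whatever the observed path, in line with the exponential case treated next in the text.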

5. MAXIMUM LIKELIHOOD ESTIMATION FOR A RENEWAL PROCESS

Maximum likelihood estimators (MLE) may be obtained by
maximizing (4.3) over a selected class of densities for an observed
sample function $R(t) = (X_1, X_2, \ldots, X_{N(t)})$. The classes of densities
considered will be exponential, increasing failure rate, and
non-increasing. Throughout the remainder of the paper it will be
assumed that $H_i(x)$ is absolutely continuous $(1 \le i \le m)$ and that
whenever $t$ is fixed, $N(t)$ and $u_t$ will be denoted by $N$ and
$U$ respectively.

a. Exponential density with parameter $\lambda$, that is
$h(x) = \lambda \exp(-\lambda x)$.

From (4.3) the likelihood function is

    $L(v) = \exp(-\lambda U) \prod_{k=1}^{N} \lambda \exp(-\lambda X_k)$

and the log likelihood function is

(5.1)    $N \log \lambda - \lambda [\sum_{k=1}^{N} X_k + U]$.

Since $\sum_{k=1}^{N} X_k + U = t$, the maximum of (5.1) occurs at $\hat\lambda = N/t$, so the
MLE for $h(x)$ is given by

(5.2)    $\hat h(x) = [N/t] \exp[-Nx/t]$.

The MLE (5.2) is strongly consistent since $N/t \to \lambda$ (a.s.). This
example is the well known one of the Poisson process for which
the estimator of $\lambda$ is the same for a fixed-time sample as for
a fixed-number-of-events sample.
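
A short numerical check of the exponential case may help. The sketch below is illustrative, with hypothetical function names: it evaluates the log likelihood $N \log \lambda - \lambda(\sum X_k + U)$, where the bracket equals $t$, and returns the maximizer $N/t$.

```python
import math

def exp_log_likelihood(lam, X, t):
    """Log likelihood (5.1): N log(lam) - lam * (sum of X_k + U), and sum + U = t."""
    return len(X) * math.log(lam) - lam * t

def exp_mle(X, t):
    """MLE of the exponential rate from a renewal process observed on [0, t]."""
    return len(X) / t

def exp_density_mle(x, X, t):
    """Estimated density (5.2) evaluated at x."""
    lam = exp_mle(X, t)
    return lam * math.exp(-lam * x)
```

With holding times `[1.0, 2.0]` observed over `t = 4.0`, the estimate is `0.5`, and the log likelihood there exceeds its value at nearby rates such as `0.4` or `0.6`.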

b. Increasing failure rate (IFR) densities, that is the class
of densities for which the failure rate $q(y) = h(y)/[1 - H(y)]$ is
increasing. Marshall and Proschan [10] and Grenander [9] have
derived the MLE for $q(x)$ based on a sample of non-random size
(i.e. $U = 0$ and $N(t) = n$) to be

(5.3)    $\hat q(y) = 0$    for $y < Y_1$,

         $\hat q(y) = \min_{v \ge i+1} \max_{u \le i} (v - u)\, [(n - u)(Y_{u+1} - Y_u) + \cdots + (n - v + 1)(Y_v - Y_{v-1})]^{-1}$
                      for $Y_i \le y < Y_{i+1}$ $(1 \le i \le n-1)$,

         $\hat q(y) = \infty$    for $y \ge Y_n$,

where $\{Y_i;\ 1 \le i \le n\}$ are $\{X_i;\ 1 \le i \le n\}$ arranged in increasing
order. By an argument similar to the one used in [10], the MLE
for $q(x)$ can be derived for a renewal process.

Theorem 5.1: Let $(Y_1, Y_2, \ldots, Y_N)$ be an ordered sample from
an IFR renewal process. If $Y_{i_0} \le U < Y_{i_0+1}$ for $1 \le i_0 \le N - 1$,
or $U \ge Y_N$ and $i_0 = N$, then the MLE of $q(y)$ is given by

(5.4)    $\hat q(y) = 0$    for $y < Y_1$,

         $\hat q(y) = \min_{v \ge i+1} \max_{u \le i} (v - u)\, [c_u + c_{u+1} + \cdots + c_{v-1}]^{-1}$
                      for $Y_i \le y < Y_{i+1}$ $(1 \le i \le N-1)$,

         $\hat q(y) = \infty$    for $y \ge Y_N$,

where

(5.5)    $c_i = (N - i + 1)(Y_{i+1} - Y_i)$    for $1 \le i < i_0$,

         $c_i = (N - i_0)(Y_{i_0+1} - Y_{i_0}) + (U - Y_{i_0})$    for $i = i_0$,

         $c_i = (N - i)(Y_{i+1} - Y_i)$    for $i_0 < i \le N$.

If $U < Y_1$, $\hat q(y)$ is given by (5.3).
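
The min-max formula of Theorem 5.1 can be evaluated directly. The sketch below is a hedged illustration rather than the authors' procedure: the function name `ifr_mle_steps` is hypothetical, the weights are taken as $c_i = (N-i+1)(Y_{i+1}-Y_i)$ for $i < i_0$, $c_{i_0} = (N-i_0)(Y_{i_0+1}-Y_{i_0}) + (U - Y_{i_0})$, and $c_i = (N-i)(Y_{i+1}-Y_i)$ for $i > i_0$, and the observations are assumed distinct so that no sum of weights vanishes. It returns the finite step values on $[Y_i, Y_{i+1})$ only; the estimate is $0$ below $Y_1$ and $\infty$ from $Y_N$ on.

```python
def ifr_mle_steps(X, U):
    """Return (Y, q) where q[i-1] is the IFR failure-rate MLE on [Y_i, Y_{i+1}), i = 1..N-1."""
    Y = sorted(X)
    N = len(Y)
    i0 = sum(1 for y in Y if y <= U)        # largest i with Y_i <= U (0 if U < Y_1)
    c = []
    for i in range(1, N):                   # 1-based i = 1..N-1; c_N is never needed
        gap = Y[i] - Y[i - 1]               # Y_{i+1} - Y_i
        if i < i0:
            c.append((N - i + 1) * gap)
        elif i == i0:
            c.append((N - i) * gap + (U - Y[i - 1]))
        else:
            c.append((N - i) * gap)         # also covers U < Y_1, the fixed-sample case
    prefix = [0.0]
    for ci in c:
        prefix.append(prefix[-1] + ci)      # prefix[k] = c_1 + ... + c_k
    q = []
    for i in range(1, N):
        q.append(min(
            max((v - u) / (prefix[v - 1] - prefix[u - 1]) for u in range(1, i + 1))
            for v in range(i + 1, N + 1)
        ))
    return Y, q
```

This direct evaluation costs on the order of $N^3$ operations; a pool-adjacent-violators pass would give the same step values faster, but the brute-force form mirrors the formula in the text.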


Proof: Since $h = q \exp(-Q)$ and $1 - H = \exp(-Q)$, where
$Q(y) = \int_0^y q(z)\, dz$, the log likelihood
function can be written as

(5.6)    $\log L = \sum_{i=1}^{N} \log q(Y_i) - \sum_{i=1}^{N} Q(Y_i) - Q(U)$.

For $q(y)$ increasing,

    $\sum_{i=1}^{N} Q(Y_i) \ge \sum_{i=1}^{N-1} (N - i)(Y_{i+1} - Y_i)\, q(Y_i)$

and

    $Q(U) \ge \sum_{i=1}^{i_0 - 1} (Y_{i+1} - Y_i)\, q(Y_i) + (U - Y_{i_0})\, q(Y_{i_0})$.

Let $\{c_i;\ 1 \le i \le N\}$ be defined by (5.5). From (5.6),

(5.7)    $\log L \le \sum_{i=1}^{N} \log q(Y_i) - \sum_{i=1}^{N-1} c_i\, q(Y_i)$.

Without the restriction that $q(Y_1) \le q(Y_2) \le \cdots \le q(Y_N)$, the maximum
of the right hand side of (5.7) is achieved for $q(Y_i) = c_i^{-1}$ for
$1 \le i \le N$. (For $i = N$, $c_N$ is not defined, but the limiting
argument used in [10] to obtain (5.3) can be applied to get $c_N^{-1} = \infty$.)
However, $q(Y_1) \le q(Y_2) \le \cdots \le q(Y_N)$ defines a convex set and the
right side of (5.7) satisfies Brunk's conditions, so Brunk's result
(Corollary 2.1, [3]) can be applied to obtain the maximum at (5.4).

If $U < Y_1$, $Q(U) \ge 0$ and (5.7) reduces to the corresponding
statement for $U = 0$, which is maximized by (5.3).

The MLE for $h(y)$ is obtained from (5.3) or (5.4) in the
natural way, that is

(5.8)    $\hat h(y) = \hat q(y) \exp[-\int_0^y \hat q(z)\, dz]$.

The MLE (5.4) can be shown to be a consistent estimator, so
that (5.8) is a consistent estimator of $h(y)$.

Theorem 5.2: If $q(y)$ is increasing, then for every $t_0$

    $q(t_0^-) \le \liminf \hat q(t_0) \le \limsup \hat q(t_0) \le q(t_0^+)$.

Proof: The proof follows directly from the consistency of (5.3)
(c.f. Theorem 4.-, [10]) after the observation that

    $(N - i)(Y_{i+1} - Y_i) \le c_i \le (N - i + 1)(Y_{i+1} - Y_i)$,    $1 \le i \le N - 1$.

This type of solution has also been obtained for the MLE of a
decreasing failure rate density for non-random sample size (c.f.
Section 6, [10]).

c. Non-increasing densities, that is the class of densities
for which $h(x_1) \ge h(x_2)$ if $x_1 < x_2$. For non-random sample size,
Grenander [9] has derived for this case the MLE for a density $h(y)$
and for the corresponding distribution function $H(y)$ (c.f. 3.1, [9]).
For an ordered sample $(Y_1, \ldots, Y_n)$ of fixed size the MLE of $H(y)$
is the smallest concave majorant of the empirical distribution
function. The MLE for $h(y)$ can be written in the same form as
(5.3), namely

(5.9)    $\hat h(y) = \max_{v \ge i+1} \min_{u \le i} n^{-1} [(v - u)(Y_v - Y_u)^{-1}]$    if $Y_i < y \le Y_{i+1}$ $(0 \le i \le n-1)$,

         $\hat h(y) = 0$    if $y > Y_n$,

where $Y_0$ is the left end point of the support of $H(y)$.
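
The max-min expression for the fixed-sample-size MLE of a non-increasing density can be evaluated by brute force. The sketch below is illustrative only: the name `grenander_steps` and the default left end point `y0 = 0.0` are assumptions, the ratio is read as $(v - u)/[n (Y_v - Y_u)]$, i.e. the slope of a chord of the empirical distribution function, and the observations are assumed distinct and larger than `y0`.

```python
def grenander_steps(X, y0=0.0):
    """Return (Y, h) with h[i] the estimate on (Y_i, Y_{i+1}], i = 0..n-1, and Y_0 = y0.

    The estimate is zero to the right of Y_n.
    """
    Y = [y0] + sorted(X)
    n = len(X)
    h = []
    for i in range(n):
        h.append(max(
            # chord slope of the empirical d.f. between Y_u and Y_v
            min((v - u) / (n * (Y[v] - Y[u])) for u in range(0, i + 1))
            for v in range(i + 1, n + 1)
        ))
    return Y, h
```

The step values are the slopes of the smallest concave majorant of the empirical distribution function, so they are non-increasing and the resulting step function integrates to one over $(Y_0, Y_n]$.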

For a renewal process with a non-increasing density, a MLE
can be obtained within the class $\mathcal{H} = \{h : \int_0^{\infty} h(x)\, dx \le 1\}$. Let
$(X_1, \ldots, X_{N(t)})$ be a sample from a renewal process over $[0, t]$,
and let $(Y_1, \ldots, Y_n)$ denote $\{X_i : 1 \le i \le N(t) = n\}$ arranged
in increasing order. Let $\mathcal{H}_\alpha$ be the subclass of non-increasing
densities $h \in \mathcal{H}$ which satisfy

    $\int_{Y_{i-1}}^{Y_i} h(y)\, dy = \alpha_i$    $(1 \le i \le n)$

for some fixed constants $\alpha_1, \ldots, \alpha_n$. For $1 \le i \le n$ and
$Y_{i-1} < y \le Y_i$ define

(5.10)    $h^*(y) = \alpha_i/(Y_i - Y_{i-1}) = h_i$,

and for $y > Y_n$ let $h^*(y)$ be any function which is non-increasing
on $(Y_n, \infty)$ and satisfies

    $\int_{Y_n}^{\infty} h^*(y)\, dy \le \alpha$.

For $h \in \mathcal{H}_\alpha$ one has

    $[1 - H(U)] \prod_{i=1}^{n} h(Y_i) \le \alpha \prod_{i=1}^{n} h_i \equiv L(\alpha, h_1, \ldots, h_n)$,

that is, the maximum of the likelihood function over $\mathcal{H}_\alpha$ is
attained at a density of the form (5.10) for some choice of the
constants $\alpha_i$, $1 \le i \le n$. Thus the MLE for $h \in \mathcal{H}$ is obtained
by maximizing over $h^* \in \mathcal{H}_\alpha$ for which the $h_i$'s are
non-increasing. Specifically, the MLE for $h \in \mathcal{H}$ is that function $\hat h$
which maximizes $L(\alpha, h_1, \ldots, h_n)$ subject to

(5.11)    $0 \le \alpha \le 1$,    $h_1 \ge h_2 \ge \cdots \ge h_n \ge 0$,

(5.12)    $\int_0^U h^*(y)\, dy = 1 - \alpha$,    $\int_U^{\infty} h^*(y)\, dy \le \alpha$.

If $Y_{i_0 - 1} \le U < Y_{i_0}$ for $1 \le i_0 \le n$, (5.12) can be written
as

(5.13)    $\sum_{i=1}^{i_0 - 1} h_i (Y_i - Y_{i-1}) + h_{i_0}(U - Y_{i_0 - 1}) = 1 - \alpha$,

(5.14)    $\sum_{i=i_0+1}^{n} h_i (Y_i - Y_{i-1}) + h_{i_0}(Y_{i_0} - U) + \int_{Y_n}^{\infty} h^*(x)\, dx \le \alpha$.

The integral term in (5.14) can be set equal to zero without
affecting the likelihood, so (5.14) becomes

(5.15)    $\sum_{i=i_0+1}^{n} h_i (Y_i - Y_{i-1}) + h_{i_0}(Y_{i_0} - U) \le \alpha$.

But $L(\alpha, h_1, \ldots, h_n)$ satisfies Brunk's conditions and (5.11),
(5.13), and (5.15) define a convex set, so Brunk's iterative
procedure (c.f. Corollary 2.1, [3]) yields the required maximum.

If $U \ge Y_n$, (5.12) can be written as

(5.16)    $\sum_{i=1}^{n} h_i (Y_i - Y_{i-1}) + \int_{Y_n}^{U} h^*(y)\, dy = 1 - \alpha$    and    $\int_U^{\infty} h^*(y)\, dy \le \alpha$.

Pick $h^*(y)$ to be zero for $y > Y_n$. Then (5.16) can be written

(5.17)    $\sum_{i=1}^{n} h_i (Y_i - Y_{i-1}) = 1 - \alpha$,    $\int_U^{\infty} h^*(y)\, dy = 0$.

Again (5.11) and (5.17) define a convex set and Brunk's procedure
can be applied.

The $h^*$ chosen will possibly have mass at $y = \infty$, which
should not be surprising since $U > Y_n$ represents the information
that there is an observation larger than all the other observations.
The arbitrary character of $h^*$ for $y > Y_n$ results in a similar
arbitrariness in the MLE. For $y > Y_n$ the MLE can be extended in
any manner which is non-increasing and which maintains the required
area.

6. MAXIMUM LIKELIHOOD ESTIMATION FOR A MARKOV RENEWAL PROCESS

Throughout this section, write $N(t) = N$, $u_t = U$, $J_{N(t)} = J$
whenever $t$ is fixed. From (4.2) the likelihood function for a
sample function $(J_0, \ldots, J_N, X_1, \ldots, X_N)$ is

(6.1)    $L = p_{J_0} [1 - H_J(U)] \prod_{k=0}^{N-1} p_{J_k J_{k+1}}\, h_{J_k}(X_{k+1})$,

which may be rewritten as

(6.2)    $L = p_{J_0} [1 - H_J(U)] \prod_{i=1}^{m} \{\prod_{k=1}^{m} p_{ik}^{N_{ik}(t)} \prod_{k=1}^{N_i(t)} h_i(X_{i,k})\}$.

Consider a maximum likelihood problem for which the quantities
$\{p_{ik};\ 1 \le i, k \le m\}$ and $\{H_i(x);\ 1 \le i \le m\}$ are not functionally
dependent. The likelihood function then factors into two parts
given by

(6.3)    $p_{J_0} \prod_{i=1}^{m} \prod_{k=1}^{m} p_{ik}^{N_{ik}(t)}$

and

(6.4)    $[1 - H_J(U)] \prod_{i=1}^{m} \prod_{k=1}^{N_i(t)} h_i(X_{i,k})$,

which can be separately maximized. If, furthermore, the $H_i$'s
themselves are not functionally dependent, then (6.4) can be
factored into $m$ parts given by

(6.5)    $\prod_{k=1}^{N_i(t)} h_i(X_{i,k})$,    $i \ne J$,

and

(6.6)    $[1 - H_J(U)] \prod_{k=1}^{N_J(t)} h_J(X_{J,k})$,

which can be maximized separately.

Thus the problem of obtaining an MLE for an MRP in which
$(p_{ij}), H_1, H_2, \ldots, H_m$ are not functionally related reduces to
three separate maximum likelihood problems: (i) the problem of
maximizing (6.3), which is equivalent to finding the MLE $\hat p_{ij}$ of the
transition matrix of a Markov chain, (ii) the problem of maximizing
(6.5), which is equivalent to finding the MLE $\hat H_i$ for $m - 1$ densities
based on non-random sample sizes, (iii) the problem of maximizing
(6.6), which is equivalent to finding the MLE $\hat H_J$ of the density of
a renewal process. Solutions of problem (i) have been obtained by
Billingsley [4]. Problem (ii) is just the classical maximum
likelihood problem for which solutions are well known. The solution of
problem (iii) has been obtained for a few cases in Chapter 5.

In particular the MLE for an element $Q_{ij}(x)$ of the transition
distribution matrix is given by

    $\hat p_{ij}(t)\, \hat H_J(x;t)$    if $i = J_{N(t)}$,

    $\hat p_{ij}(t)\, \hat H_i(x;t)$    if $i \ne J_{N(t)}$.

If a functional relation exists between $(p_{ij}), H_1, H_2, \ldots, H_m$
the problem is much more difficult.


REFERENCES

1. Arthur Albert, Estimating the infinitesimal generator of a
   continuous time, finite state Markov process, Ann. Math.
   Stat., vol. 33 (1962), pp. 727-753.

2. M. S. Bartlett, The frequency goodness of fit test for
   probability chains, Proc. Camb. Phil. Soc., vol. 47 (1951),
   pp. 86-95.

3. H. D. Brunk, On the estimation of parameters restricted by
   inequalities, Ann. Math. Stat., vol. 29 (1958), pp. 437-454.

4. Patrick Billingsley, Statistical inference for Markov processes,
   Institute of Mathematical Statistics, University of Chicago
   Statistical Research Monographs, University of Chicago
   Press, Chicago, 1961.

5. Patrick Billingsley, Statistical methods in Markov chains,
   Ann. Math. Stat., vol. 32 (1961), pp. 12-40.

6. Harald Cramér, Mathematical methods of statistics, Princeton
   University Press, 1946.

7. D. A. Darling, The Kolmogorov-Smirnov, Cramér-von Mises
   tests, Ann. Math. Stat., vol. 28 (1957), pp. 823-838.

8. Cyrus Derman, Some asymptotic distribution theory for Markov
   chains with a denumerable number of states, Biometrika,
   vol. 43 (1956), pp. 285-294.

9. Ulf Grenander, On the theory of mortality measurement, Part
   II, Skand. Aktuarietidskr., vol. 39 (1956), pp. 125-153.

10. Albert Marshall and Frank Proschan, Maximum likelihood
    estimation for distributions with monotone failure rate, to
    appear in Ann. Math. Stat.

11. Ronald Pyke, Markov renewal processes: definitions and
    preliminary properties, Ann. Math. Stat., vol. 32 (1961),
    pp. 1231-1242.

12. Ronald Pyke, Limit theorems for Markov renewal processes, to
    appear in Ann. Math. Stat., vol. 35 (1964).


